[bgpmon]: Fix exception in bgpmon caused by duplicate bgp neighbor ID#6546
[bgpmon]: Fix exception in bgpmon caused by duplicate bgp neighbor ID#6546gechiang merged 3 commits intosonic-net:masterfrom
Conversation
It is possible that BGP neighbors in IPv4 and IPv6 address families share the same name (such as bgp monitor). However, such case is not handled in bgpmon, and an Exception will be raised. This commit will address the issue. Signed-off-by: bingwang <bingwang@microsoft.com>
|
@gechiang , I think bgpmon still missing unit test. please add unit test to protect such behavior. |
|
@bingwang-ms Thanks for triaging this issue. I have a very basic question. Why is this "11.0.0.1" Neighbor being reported by FRR as part of the IPV6 unicast neighbor? Isn't this not a valid behavior that needs to be addressed first by FRR? Will this be investigated or raised as an issue with BGP FRR community? But in any case we need to beef up our code to prevent this from causing an issue. As for the fix I am thinking we should convert the "peer list" to a "peer set". this way the duplicated entry(ies) will only be in the set ONCE. If we do this it will also avoid causing the STATE DB issue since we no longer have duplicated peers. Let's not try to append ip version to the key to solve the initial issue. Let me know if my suggestion is sound? Let me know if you agree with my suggested approach... |
Thanks @gechiang . I'm not sure if the behavior is triggered by FRR or our scripts. So I filed a issue here. |
@gechiang we use v4 neighbor peering to advertise v6 routes. |
|
@abdosi Thanks for responding. So this is the same peer that got reported via IPV4 unicast as well as IPV6 unicast. Therefore one of this should be ignored. This come back to the suggestion on how we should address this duplication. |
Signed-off-by: bingwang <bingwang@microsoft.com>
|
Updated. Thanks |
src/sonic-bgpcfgd/bgpmon/bgpmon.py
Outdated
| self.flush_pipe(data) | ||
| # Save the new List | ||
| self.peer_l = self.new_peer_l[:] | ||
| # Save the new dict |
There was a problem hiding this comment.
Overall the change looks great. Just the comment here needs to be changed from "dict" to "set" to avoid confusion. Nice work overall! Thanks for the fix!
There was a problem hiding this comment.
Thanks. Updated.
So I'll close #6547. I think it's not an issue. Correct?
There was a problem hiding this comment.
@bingwang-ms Yes. #6547 is not an issue. Your PR fix will not allow duplicated entry in STATE DB to occur.
Thanks!
Signed-off-by: bingwang <bingwang@microsoft.com>
|
Retest this, please |
|
retest vsimage please |
|
/Azurepipelines run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
…#6546) * Fix exception in bgpmon caused by duplicate keys It is possible that BGP neighbors in IPv4 and IPv6 address families share the same name (such as bgp monitor). However, such case is not handled in bgpmon, and an Exception will be raised. This commit will address the issue by Using set instead of list to avoid duplicate keys.
…#6546) * Fix exception in bgpmon caused by duplicate keys It is possible that BGP neighbors in IPv4 and IPv6 address families share the same name (such as bgp monitor). However, such case is not handled in bgpmon, and an Exception will be raised. This commit will address the issue by Using set instead of list to avoid duplicate keys.
Signed-off-by: bingwang bingwang@microsoft.com
- Why I did it
It is possible that BGP neighbors in IPv4 and IPv6 address families share the same name.
For example, if belowing config is applied to creare a bgp monitor:
Then a bgp neighbor named
11.0.0.1will exist in both ipv4 and ipv6 neighbors.However, such case is not handled in
bgpmon.py, and an Exception will be raised because the same key is deleted twice.This commit will address the issue.
- How I did it
Use
setinstead oflistto store neighbors' ID to avoid duplicate keys.Please be noted that this PR only fix exception in
bgpmon.py. Similar issue exists inSTATE_DBbecauseNEIGH_STATE_TABLE|x.x.x.xis used as keys inSTATE_DB. If neighbors have duplicated keys, the value will be overwriten.- How to verify it
Verified on A7260. No exception is observed after update.
- Which release branch to backport (provide reason below if selected)
- Description for the changelog
No.
- A picture of a cute animal (not mandatory but encouraged)